ABSTRACT The lack of an experimentally determined
structure of a target protein frequently limits the application of 
structure-based drug design methods. In an effort to overcome 
this limitation, we have investigated the use of computer 
model-built structures for the identification of previously un- 
known inhibitors of enzymes from two major protease families, 
serine and cysteine proteases. We have successfully used our 
model-built structures to identify computationally and to con-
firm experimentally the activity of nonpeptidic inhibitors di-
rected against important enzymes in the schistosome [2-(4-
methoxybenzoyl)-1-naphthoic acid, Ki = 3 µM] and malaria
{oxalic bis[(2-hydroxy-1-naphthylmethylene)hydrazide], IC50
= 6 µM} parasite life cycles.



Proteases are involved in many important biological pro- 
cesses including protein turnover, blood coagulation, com- 
plement activation (1), hormone processing (2), and cancer 
cell invasion (3). Thus, they are frequently chosen as targets 
for drug design and discovery. Noteworthy examples include 
the design of angiotensin-converting enzyme inhibitors for
the treatment of hypertension (4) and programs to develop 
human immunodeficiency virus protease inhibitors to block 
proliferation of the AIDS virus (5). The critical role proteases 
play in the life cycle of parasitic organisms also makes them 
attractive drug-design targets for these infectious diseases 
(6). 

In the simplest terms, structure-based drug design
methods identify favorable and unfavorable interactions be- 
tween a potential inhibitor and target receptor and maximize 
the beneficial interactions to increase binding affinity. Ob- 
taining an accurate structure for the receptor or ligand- 
receptor complex is a logical step in this process. X-ray 
crystallography continues to be the source of high-resolution 
information about protein structures. However, considerable 
delays often exist between determining the sequence of a 
protein and solving its structure. Difficulties in protein ex- 
pression and more commonly in protein crystallization can 
delay x-ray structure determination. 

Currently, no general method exists for predicting tertiary 
structure from amino acid sequences. However, when a 
protein target is homologous to another protein or group of 
proteins of known structure, a sensible model structure can 
be proposed. Recent comparisons between model and crystal 
structures permit an assessment of the overall accuracy 
expected from homology model-built structures (7-9). For a 
sequence that is 80% identical to a protein of known struc-
ture, the expected rms deviation of the core residues is ≈0.6
Å (10). The expected rms deviation increases to 1.8 Å when
the sequences are only 20% identical. However, model-built 
structures could still be useful in finding previously unknown 
lead compounds despite the uncertainties in the lower part of 
this range if the errors cluster far away from the enzyme
active site. 
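
The rms deviation quoted above can be illustrated with a short calculation. The sketch below is not taken from the study; it assumes the two coordinate sets have already been superimposed and list equivalent atoms in the same order, and the function name `rms_deviation` is hypothetical.

```python
import math

def rms_deviation(coords_a, coords_b):
    """Root-mean-square deviation between two equally long lists of (x, y, z) points.

    Assumption: the structures are already superimposed and atom i in
    coords_a corresponds to atom i in coords_b.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        # Squared distance between the paired atoms.
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))
```

Restricting the atom lists to active-site residues would give a local figure, which is the quantity that matters if the modeling errors cluster away from the binding site.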

The proteases targeted for inhibitor design in this study are 
important in establishing schistosome infection or necessary 
for the maintenance of malarial infection. Schistosomiasis is 
a snail-borne disease that is contracted by individuals who 
come into contact with the parasites in infested waters. 
Infectious larvae (cercariae) secrete an elastase to invade the 
skin of the human host and initiate infection. Once in the 
circulatory system, the schistosomes mature and reproduce. 
Thousands of eggs become trapped in the portal circulation 
of the liver, and the host immune response leads to portal 
hypertension. The protease that is implicated in skin pene- 
tration has been purified and characterized, and preliminary 
studies suggest that cutaneous application of an inhibitor of 
the cercarial elastase might prevent infection (11). 

The increased incidence of drug-resistant strains of malaria 
(especially Plasmodium falciparum) necessitates the search 
for new therapies. Malaria infection includes an erythrocytic 
phase that is responsible for all the clinical manifestations of 
the disease (12). During this phase, erythrocytic trophozoites 
degrade hemoglobin as a principal source of amino acids. 
Rosenthal and coworkers (13, 14) have identified a critical 
cysteine protease that appears to be involved in the degra- 
dation of hemoglobin, the parasites' primary source of amino 
acids. Blocking this enzyme with cysteine protease inhibitors 
[L-trans-epoxysuccinylleucylamido-(4-guanidino)butane
(E64), benzyloxycarbonyl-Phe-Arg-fluoromethyl ketone] in 
culture arrests further growth and development (15). Thus, 
this enzyme is a promising target for new modes of antima- 
larial chemotherapy. 

METHODS 

Model Construction. Three-dimensional models of the 
structures of cercarial elastase and trophozoite cysteine 
protease were built following the approach of Blundell and 
coworkers (16, 17). Seven mammalian serine proteases, 
bovine chymotrypsin (18), porcine pancreatic elastase (19), 
rat mast cell protease (20), human neutrophil elastase (21), rat 
tonin (22), porcine kallikrein (23), and bovine trypsin (24), 
were used to derive a structural alignment for cercarial 
elastase (25). Papain (26) and actinidin (27) were used for 
trophozoite cysteine protease. The conformations of side 
chains were retained when possible, and the statistically most 
likely rotamer was selected when no conformational infor- 
mation was available (17). Loops were placed by using a 
combination of the loop dictionary and key residue ap- 
proaches (28, 29). The resulting models were refined by 
energy minimization with the AMBER potential function (30).
Models were validated with several computational strategies,
including QPACK to probe side-chain volume (31), the profile
method of Lüthy et al. (32), a Ramachandran map analysis of
backbone geometry, and solvent-accessibility calculations 
(33). 

Screening the Fine Chemicals Directory Using DOCK3.0. The 
two protease model structures were used as receptors for 
ligand docking. DOCK3.0 is an automatic method to screen
small-molecule data bases for ligands that could bind to a 
given receptor (34). DOCK3.0 characterizes the grooves and 
invaginations of the active site with sets of overlapping 
spheres. The generated sphere centers constitute an irregular 
grid that can be matched with the atom centers of a potential 
ligand. The quality of fit of a ligand to the binding site is 
judged either by shape complementarity or by a simplified 
molecular mechanics force-field energy (estimated interac- 
tion energy). 
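
To make the idea of a simplified force-field score concrete, the following sketch sums a 6-12 van der Waals term and a Coulomb term over ligand-receptor atom pairs. It is an illustration only, not the DOCK3.0 scoring code: the function name, the single shared well depth and contact distance, the constant dielectric, and the tuple layout for atoms are all assumptions made for this example.

```python
import math

def interaction_energy(ligand_atoms, receptor_atoms,
                       epsilon=0.1, r_min=3.8, dielectric=4.0):
    """Estimate an interaction energy (kcal/mol) between two sets of atoms.

    Each atom is an (x, y, z, partial_charge) tuple with coordinates in
    angstroms. The parameters epsilon (well depth), r_min (distance of the
    van der Waals minimum), and dielectric are illustrative placeholders,
    not values from the study. Overlapping atoms (distance 0) are not handled.
    """
    energy = 0.0
    for lx, ly, lz, lq in ligand_atoms:
        for rx, ry, rz, rq in receptor_atoms:
            r = math.dist((lx, ly, lz), (rx, ry, rz))
            ratio6 = (r_min / r) ** 6
            # 6-12 potential with its minimum of -epsilon at r = r_min.
            vdw = epsilon * (ratio6 ** 2 - 2.0 * ratio6)
            # Coulomb term; 332 converts e^2/angstrom to kcal/mol.
            coulomb = 332.0 * lq * rq / (dielectric * r)
            energy += vdw + coulomb
    return energy
```

A shape-complementarity score, by contrast, would ignore the charges and reward only close, non-overlapping contacts.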

DOCK3.0 was used to search the Fine Chemicals Directory
(Molecular Design Limited, San Leandro, CA) of 55,313 
commercially available small molecules. The structures of 
the small molecules were obtained computationally by using 
a heuristic algorithm, CONCORD, developed by R. Pearlman at
the University of Texas. CONCORD-generated structures are
estimated to be ≈90% in agreement with those structures
optimized by molecular mechanics calculations (35). The 
Fine Chemicals Directory was chosen over the Cambridge 
Structural Database of experimentally determined structures 
because of the ease with which interesting compounds could 
be obtained. 

In a typical DOCK search, the top-scoring 100-200 mole-
cules are examined with 10-50 of these selected for experi-
mental testing (36). Because model protein structures were 
used instead of crystallographically determined structures, 
an arbitrarily large number of small molecules were saved. 
For each enzyme system, the 2200 molecules with the best 
shape-complementarity scores and the 2200 with the best 
force-field scores were saved. The resulting 8800 compounds 
were visually screened in the context of the active site by
using the molecular display software MIDASPLUS (37).
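
The bookkeeping behind the saved lists can be sketched as follows. This is a hypothetical illustration rather than the procedure actually scripted in the study: it assumes each scoring scheme yields a mapping from compound identifier to score, that a higher shape score is better, and that a lower (more negative) force-field energy is better.

```python
def top_ranked(scores, n, lower_is_better):
    """Return the n best compound IDs from a {compound_id: score} mapping."""
    ordered = sorted(scores, key=lambda cid: scores[cid],
                     reverse=not lower_is_better)
    return ordered[:n]

def saved_compounds(shape_scores, force_field_scores, n=2200):
    """Map each saved compound ID to the set of lists it appears on."""
    shape_list = top_ranked(shape_scores, n, lower_is_better=False)
    force_list = top_ranked(force_field_scores, n, lower_is_better=True)
    origin = {}
    for cid in shape_list:
        origin.setdefault(cid, set()).add("shape")
    for cid in force_list:
        origin.setdefault(cid, set()).add("force-field")
    return origin
```

Recording which list each compound came from makes it possible to tally later how many selected compounds originated from the shape list, the force-field list, or both.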

Because of the uncertainties inherent in model-built struc- 
tures, the scores generated by DOCK3.0 did not influence the 
visual screening process. Instead, compounds were judged 
solely on how they might interact with the active site in the 
putative ligand-receptor complex. In an effort to be self- 
consistent, the resulting 8800 compounds were screened 
three times. No compounds were selected during the first 
screening in an attempt to get acquainted with the systems. 
During the second and third passes, compounds that filled the 
site and had potential hydrogen-bonding and electrostatic 
interactions were selected for further inspection. Only com- 
pounds that were chosen on both the second and third 
screenings were considered further. From this list, an effort 
was made to choose compounds that were chemically diverse 
and that appeared to interact with the receptor in different 
ways. Fifty-two compounds were ultimately chosen for test- 
ing against the cercarial elastase, and 31 compounds were 
chosen for testing against the trophozoite cysteine protease. 
This screening process took ≈1 week of effort. As the
enzyme active sites became more familiar with each succes-
sive pass, the time needed to examine the ligand-receptor
complex shortened. 

Of the 52 compounds selected for the cercarial elastase, 33 
compounds were from the force-field list, 10 compounds 
were from the shape list, and 9 compounds appeared on both 
lists. Of the 31 compounds selected for the malarial protease, 
20 compounds were from the shape list, and 11 compounds 
were from the force-field list. These compounds were ranked 
as high as 4th and as low as 1939th (out of 2200) by the scores 
generated by DOCK3.0. 



Ki Determination for the Inhibitors Against Cercarial 
Elastase, Chymotrypsin, and Elastase. Cercarial elastase was 
purified as described (38). Initial reaction velocities were 
determined at room temperature for each enzyme by using 
tetrapeptide thiobenzyl ester substrates in the presence of 20 
µM 4,4'-dithiopyridine and following the absorbance at 324
nm for 1 min after enzyme addition (39). Enzyme concentra-
tions were determined by active-site titration with chloro-
methyl ketone inhibitors, and used at 1/100th of the lowest
substrate concentration. The reaction buffer was 100 mM
glycine-NaOH, pH 9.0/2 mM CaCl2. The specific substrates
used were N-succinylalanylalanylprolylphenylalanyl thioben-
zyl ester for cercarial elastase and chymotrypsin, and N-suc-
cinylalanylalanylprolylalanyl thiobenzyl ester for pancreatic
elastase at concentrations from 25 to 500 µM. Inhibitors were
prepared as 100 mM stock solutions in dimethyl sulfoxide and
used at concentrations from 0 to 100 µM. Reaction velocities
were determined in triplicate for each point and plotted by
using the method of Dixon. Data were also plotted using the
Hanes transformation of the Michaelis-Menten equation to
ascertain the competitive nature of inhibition. Ki was deter-
mined directly from the Dixon plot (40) and confirmed by
replots of Km(app)/V(app) from the Hanes plot (41).
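
For a competitive inhibitor, 1/v is linear in inhibitor concentration [I] at fixed substrate concentration, and the lines obtained at different substrate concentrations intersect at [I] = -Ki. The sketch below shows one way to read Ki from that intersection. It is a minimal illustration under these textbook assumptions, not the analysis code used in the study; the function names and the synthetic parameters in the usage note are hypothetical.

```python
def competitive_velocity(substrate, inhibitor, vmax, km, ki):
    """Michaelis-Menten rate in the presence of a competitive inhibitor."""
    return vmax * substrate / (km * (1.0 + inhibitor / ki) + substrate)

def fit_line(xs, ys):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def dixon_ki(inhibitor_conc, velocities_a, velocities_b):
    """Estimate Ki from Dixon lines (1/v versus [I]) at two substrate levels."""
    slope_a, icpt_a = fit_line(inhibitor_conc, [1.0 / v for v in velocities_a])
    slope_b, icpt_b = fit_line(inhibitor_conc, [1.0 / v for v in velocities_b])
    # The two lines cross at [I] = -Ki for competitive inhibition.
    x_cross = (icpt_b - icpt_a) / (slope_a - slope_b)
    return -x_cross
```

Feeding the function velocities generated by `competitive_velocity` at two substrate concentrations recovers the Ki used to generate them; with real data, more than two substrate concentrations and replicate points would normally be used to judge how well the lines share a common intersection.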

The Trophozoite Cysteine Protease Inhibitor Studies. En- 
zyme activity was measured with the fluorogenic substrate 
benzyloxycarbonyl-Phe-Arg-(7-amino-4-methylcoumarin) as 
described (15). Trophozoite extracts were incubated with 
reaction buffer (in 0.1 M sodium acetate/10 mM dithiothrei- 
tol, pH 5.5) and an appropriate concentration of inhibitor for 
30 min at room temperature. Benzyloxycarbonyl-Phe-Arg-
(7-amino-4-methylcoumarin) (50 µM final concentration) was
then added, and fluorescence (380 nm excitation, 460 nm
emission) was measured continuously over 30 sec. The
slope of fluorescence over time for each inhibitor concentra- 
tion was compared with that of controls in multiple assays, 
and the IC50 was determined from plots of percent control
activity over inhibitor concentration.

Fig. 1. (a) (i) Naphthol blue-black, (ii) 2-(4-Methoxybenzoyl)-1-
naphthoic acid. (b) Oxalic bis[(2-hydroxy-1-naphthylmethylene)hy-
drazide].

Table 1. Ki values for compounds that inhibit cercarial elastase

Effect of Oxalic Bis[(2-hydroxy-1-naphthylmethylene)hy-
drazide] on [3H]Hypoxanthine Uptake as a Measure of Parasite
Metabolism. [3H]Hypoxanthine uptake was measured based
on a modification of the method of Desjardins et al. (42).
Microwell cultures of synchronized ring stage P. falciparum
parasites were incubated with inhibitor in dimethyl sulfoxide
(10% final concentration) for 4 hr. [3H]Hypoxanthine was
added (1 µCi per microwell culture; 1 Ci = 37 GBq), and the
cultures were maintained for an additional 36 hr. The cells
were then harvested and deposited onto glass-fiber filters that
were washed and dried with ethanol. [3H]Hypoxanthine
uptake was quantitated by scintillation counting. The uptake
at each inhibitor concentration was compared with that of
controls, and the IC50 value was determined from plots of
percent control uptake over inhibitor concentration.
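
Reading an IC50 from a plot of percent control against inhibitor concentration amounts to finding where the curve crosses 50%. The text does not specify how the value was read from the plots, so the sketch below assumes one simple possibility: linear interpolation on a logarithmic concentration axis between the two measured points that bracket 50%. The function name is hypothetical.

```python
import math

def ic50_from_percent_control(concentrations, percent_control):
    """Interpolate the concentration that gives 50% of control.

    Assumptions: concentrations are positive, and the response falls
    through 50% between two adjacent measured points. Returns None if
    the data never cross 50%.
    """
    points = sorted(zip(concentrations, percent_control))
    for (c_lo, p_lo), (c_hi, p_hi) in zip(points, points[1:]):
        if p_lo >= 50.0 >= p_hi and p_lo != p_hi:
            # Fraction of the way from the lower to the higher concentration.
            frac = (p_lo - 50.0) / (p_lo - p_hi)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10.0 ** log_ic50
    return None
```

Fitting a full dose-response curve would use all the points rather than just the bracketing pair, at the cost of assuming a particular curve shape.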

RESULTS AND DISCUSSION 

Nonpeptidic inhibitors were identified for both the cercarial 
elastase and the malarial cysteine protease. Approximately 
10% of the compounds tested, 5 of 52 for the cercarial elastase 
and 4 of 31 for the malarial protease, displayed activity 
against the enzymes at concentrations <100 µM. Among
these, three compounds were inhibitors at concentrations
<10 µM (Fig. 1). 2-(4-Methoxybenzoyl)-1-naphthoic acid and
naphthol blue-black inhibited the cercarial elastase with Ki
values of 3 and 6 µM, respectively (Table 1 and Fig. 2). These
two compounds also displayed specificity for the cercarial
elastase, as evidenced by the generally higher Ki values
against chymotrypsin and pancreatic elastase (Table 1). Be-
cause the S1 specificity pocket of cercarial elastase is more
similar to chymotrypsin than to pancreatic elastase, it is not
surprising that both 2-(4-methoxybenzoyl)-1-naphthoic acid
and naphthol blue-black are also good inhibitors of chymo-
trypsin. (Note that the amino acid residues on the acyl side
of the scissile bond are denoted P1, P2, . . . Pn, and those on
the leaving group side of the scissile bond are denoted P1',
P2', . . . Pn'. The corresponding binding sites on the enzyme
are S1, S2, . . . Sn and S1', S2', . . . Sn'.) Presumably, the
application of standard medicinal chemistry strategies to
these lead compounds will yield more potent and selective
inhibitors of the schistosome enzyme. Topical application of
peptide-based inhibitors has already been demonstrated to
block parasite migration through the skin (11).

Fig. 2. Representative Ki determination using the Dixon plot. In
this example, the Ki is determined for naphthol blue-black against
cercarial elastase. Each point was determined in triplicate. Each line
represents a different substrate concentration (a, 500 µM; ♦, 200
µM; □, 50 µM). Some error bars are too small to be graphed on this
plot. V, velocity.

Oxalic bis[(2-hydroxy-1-naphthylmethylene)hydrazide] in-
hibited the trophozoite cysteine protease with an IC50 of 6 µM
(Fig. 3a). When tested against cultured P. falciparum, this
compound also inhibited the incorporation of hypoxanthine,
a standard marker of parasite metabolism, at approximately
the same concentration (Fig. 3b). Because this compound can
inhibit the protease and the parasite, efforts are underway to
synthesize analogs of oxalic bis[(2-hydroxy-1-naphthylmeth-
ylene)hydrazide] and examine their therapeutic potential.

Fig. 3. (a) IC50 curve for oxalic bis[(2-hydroxy-1-naphthylmeth-
ylene)hydrazide] against malarial cysteine protease. The points are
the means of eight assays, and the error bars are the SDs of the
samples. (b) Inhibition of parasite uptake of [3H]hypoxanthine by
oxalic bis[(2-hydroxy-1-naphthylmethylene)hydrazide].

The visual screening process was reexamined for the most
active compounds in an attempt to find the relevant factors
responsible for their selection. An interesting dichotomy was
observed in the DOCK shape-based and force-field scores. All
but one of the five inhibitors of the cercarial elastase were
members of the force-field list with the following rankings: 85th,
2-(4-methoxybenzoyl)-1-naphthoic acid; 122nd, plasmo-
corinth B; 627th, naphthol blue-black; and 918th, α-pheneth-
ylphthalamic acid. The fifth compound, 9-fluorenone-4-
carboxylic acid, appeared on both lists, ranking 561st on the
force-field list and 1783rd on the shape-based list. The two
best cercarial elastase inhibitors, 2-(4-methoxybenzoyl)-1-
naphthoic acid and naphthol blue-black, ranked 85th and
627th, respectively, on the force-field list. By contrast, all
four of the malarial protease inhibitors were members of the
shape-based list, ranking as follows: 7th, 3,3'-diethyloxatri-
carbocyanine iodide; 13th, oxalic bis[(2-hydroxy-1-
naphthylmethylene)hydrazide]; 793rd, cephaloglycin; and
1193rd, 1-(2-methoxyphenyl)-6-(4-trifluoromethylphenyl)-5-
thiobiurea. The best inhibitor, oxalic bis[(2-hydroxy-1-
naphthylmethylene)hydrazide], ranked 13th. These results
may reflect the environmental differences in the active site. 
The active site of the malarial protease consists of a large 
hydrophobic cleft. Because of the absence of charged resi- 
dues in the vicinity of the putative binding site, the shape- 
based scores for hydrophobic ligands that fill the site may 
adequately estimate the enthalpy of interaction between 
ligand and receptor. By contrast, the active site of the 
cercarial elastase contains both a hydrophobic S1 pocket and
charged amino acids in the vicinity of the active site. Con-
sequently, the force-field scores, which include both van der
Waals and electrostatic components, better estimate the
interaction energy of the ligands with the active site of the
cercarial elastase.

The DOCK-generated enzyme-inhibitor complex structures 
for naphthol blue-black and oxalic bis[(2-hydroxy-1-
naphthylmethylene)hydrazide] are shown in Fig. 4. Naphthol
blue-black fits into the groove defined by the S1, S2, and S3
subsites of the cercarial elastase. In the model complex,
ligand binding is stabilized by the interaction of a phenyl
group with the hydrophobic S1 pocket. The sulfonic acid
groups could hydrogen-bond with arginines in a nearby loop
or possibly with the solvent. Similarly, oxalic bis[(2-hydroxy-
1-naphthylmethylene)hydrazide] interacts with S2 and S1'
sites of the malarial protease. The hydrophobic specificity
site, S2, is filled by a naphthol group. The other naphthol
group participates in a stacking interaction with the indole
ring of Trp-177 at the S1' site. In addition, each hydroxyl
group on the naphthol rings appears to hydrogen-bond to
Ser-160 at S2 and Gln-19 at S1'. These complexes are useful
starting points for modeling ligand-receptor interactions, but 
other possible binding modes should also be considered. 

At micromolar concentrations, it is likely that the inhibitors 
will have multiple modes of binding to the enzyme. Because 
these different binding modes are approximately isoenergetic, 
discriminating among the plausible alternatives with current 
scoring functions is difficult. Assumptions, such as rigid 
ligands and rigid receptors, are necessary for computational 
tractability but are also presumably responsible for the loss of 
resolution in these scores. The x-ray structures of thymidylate
synthase complexed with two different inhibitors that were
suggested by DOCK illustrate the challenges presented in
accurately predicting the ligand-receptor complexes (43). For
sulisobenzone, the failure to anticipate the binding of a coun-
terion in the binding site led to an inaccurate prediction of the
complex. In the case of phenolphthalein, a conformational
change by an arginine between unbound and bound states of
the enzyme and the presence of two waters in the bound state
led to a slightly different conformation of the ligand than the
one anticipated by DOCK (43). These examples highlight the
importance of crystallography to the structure-based drug-
design process. Ligand-induced conformational changes and
the presence of bound waters and counterions are details that
may be necessary for successful lead optimization.

Fig. 4. (Upper) Stereo image of naphthol blue-black docked into the active site of cercarial elastase. (Lower) Stereo image of oxalic
bis[(2-hydroxy-1-naphthylmethylene)hydrazide] docked into the active site of trophozoite cysteine protease. Catalytic residues are colored
purple and labeled for orientation. The atoms on the inhibitors are color-coded: carbons are white, oxygens are red, nitrogens are blue, and sulfurs
are cyan.

The quality of the model structure is directly related to the 
percentage sequence identity between the relevant sequences. 
The trophozoite cysteine protease is ≈33% identical to both
papain and actinidin, and the cercarial elastase is 20-25%
identical to the seven mammalian serine proteases of known
structure. Thus, we anticipate errors of 1-3 Å rms deviation in
the model atomic coordinates, although errors in the vicinity
of the active site are probably substantially smaller, reflecting
selective sequence conservation. Two explanations of the
success of our modeling/docking approach are plausible. (i)
The modeling errors in the active site are small, and the major
determinants of molecular recognition are faithfully recreated.
(ii) Alternatively, the modeling process was irrelevant, and a
homologous structure could have been substituted for com- 
putational ligand-binding studies to identify lead compounds. 
To address the latter possibility, two homologous serine 
proteases, chymotrypsin and trypsin, were used as receptors 
for ligand docking. Chymotrypsin was chosen because it 
shares with cercarial elastase a similar P1 specificity for
hydrophobic residues. Trypsin was chosen because its S1
pocket is sterically similar, despite its different peptide spec-
ificity. With the same method, DOCK3.0 was used to search the
Fine Chemicals Directory, and the top 2200 shape- 
complementarity scoring compounds and the top 2200 force- 
field scoring compounds were saved. 

The best two inhibitors of the cercarial elastase were not
included in either list of 4400 compounds predicted to inhibit
chymotrypsin or trypsin, although each shape-based list
included one of the less effective inhibitors. Due to unfavor-
able interactions seen in the model of 9-fluorenone-4-
carboxylic acid docked to chymotrypsin (negative charge in
hydrophobic S1 pocket), this compound would have been
rejected during the visual-screening evaluation. Conse-
quently, none of the five inhibitors identified for the cercarial
elastase would have been found in a DOCK3.0 search by using
the chymotrypsin active site, and only one of the 100 µM
inhibitors, α-phenethylphthalamic acid, would have been
found by using the trypsin active site. Although we cannot
rule out finding other low-micromolar inhibitors from the lists
of compounds generated by the chymotrypsin and trypsin 
searches, our results indicate that the modeling process was 
not irrelevant and that this method for inhibitor discovery is 
sensitive enough to differentiate between similar active sites 
in homologous structures. 

Despite the inherent limitations of computer model-built 
structures, these structures are helpful in finding nonpeptidic 
inhibitors active at low-micromolar concentrations. Al- 
though these compounds are far from being drugs, they are 
sensible starting points for the process of drug development. 
Because these enzymes are members of two major protease 
families, our work suggests that computer models and struc- 
ture-based drug-design methods can be applied to identify 
inhibitors of proteases that are relevant to other pathophys- 
iologic processes. 